A Comparison of Prediction Accuracy, Complexity, and Training Time of Thirty-three Old and New Classification Algorithms
Abstract
Twenty-two decision tree, nine statistical, and two neural network algorithms are compared on thirty-two datasets in terms of classification accuracy, training time, and (in the case of trees) number of leaves. Classification accuracy is measured by mean error rate and mean rank of error rate. Both criteria place a statistical, spline-based algorithm called Polyclass at the top, although it is not statistically significantly different from twenty other algorithms. Another statistical algorithm, logistic regression, is second with respect to the two accuracy criteria. The most accurate decision tree algorithm is Quest with linear splits, which ranks fourth and fifth, respectively. Although spline-based statistical algorithms tend to have good accuracy, they also require relatively long training times. Polyclass, for example, is third last in terms of median training time. It often requires hours of training compared to seconds for other algorithms. The Quest and logistic regression algorithms are substantially faster. Among decision tree algorithms with univariate splits, C4.5, Ind-Cart, and Quest have the best combinations of error rate and speed. But C4.5 tends to produce trees with twice as many leaves as those from Ind-Cart and Quest.
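The two accuracy criteria named in the abstract, mean error rate and mean rank of error rate, can be illustrated with a minimal Python sketch. The abstract does not spell out the exact procedure, so the details here are assumptions: ranks are assigned per dataset in ascending order of error rate, ties share the average rank, and both criteria are plain averages over datasets. The function names `average_ranks` and `summarize`, the algorithm labels, and the numbers are hypothetical and are not taken from the study.

```python
from statistics import mean


def average_ranks(values):
    """Rank values ascending (1 = lowest error); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # average of the 1-based positions pos+1..end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def summarize(error_rates):
    """Map each algorithm to (mean error rate, mean rank of error rate).

    error_rates: {algorithm name: [error rate on dataset 0, dataset 1, ...]}
    """
    names = list(error_rates)
    n_datasets = len(next(iter(error_rates.values())))
    rank_lists = {name: [] for name in names}
    for d in range(n_datasets):
        ranks = average_ranks([error_rates[name][d] for name in names])
        for name, rank in zip(names, ranks):
            rank_lists[name].append(rank)
    return {
        name: (mean(error_rates[name]), mean(rank_lists[name])) for name in names
    }


# Made-up error rates for three hypothetical algorithms on three datasets.
toy = {
    "A": [0.10, 0.20, 0.30],
    "B": [0.10, 0.25, 0.20],
    "C": [0.15, 0.30, 0.40],
}
summary = summarize(toy)
for name, (err, rank) in summary.items():
    print(f"{name}: mean error = {err:.3f}, mean rank = {rank:.2f}")
```

In this toy data, A and B tie on mean rank while B has the lower mean error rate, which shows that the two criteria can order algorithms differently. Mean rank is less sensitive to a single dataset with unusually large error rates.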